Toward Hands-Off Crowdsourcing: Crowdsourced Entity Matching for the Masses
Abstract
Recent approaches to crowdsourcing entity matching (EM) are limited in that they crowdsource only parts of the EM workflow, requiring a developer to execute the remaining parts. Consequently, these approaches do not scale to the growing need for EM at enterprises and crowdsourcing startups, and cannot handle scenarios where ordinary users (i.e., the masses) want to leverage crowdsourcing to match entities. In response, we propose the notion of hands-off crowdsourcing (HOC), which crowdsources the entire workflow of a task, thus requiring no developers. We show how HOC can represent a next logical direction for crowdsourcing research, scale up EM at enterprises and crowdsourcing startups, and open up crowdsourcing for the masses. We describe Corleone, a HOC solution for EM, which uses the crowd in all major steps of the EM process. Finally, we discuss the implications of our work for executing crowdsourced RDBMS joins, cleaning learning models, and soliciting complex information types from crowd workers.
Similar resources
Five Design Principles for Crowdsourced Policymaking: Assessing the Case of Crowdsourced Off-Road Traffic Law in Finland
This article reports a pioneering case study of a crowdsourced law-reform process in Finland. In the crowdsourcing experiment, the public was invited to contribute to the law-reform process by sharing their knowledge and ideas for a better policy. This article introduces a normative design framework of five principles for crowdsourced policymaking: inclusiveness, accountability, transparency, m...
Crowdsourcing and annotating NER for Twitter #drift
We present two new NER datasets for Twitter; a manually annotated set of 1,467 tweets (κ = 0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perf...
Matching or Crashing? Personality-based Team Formation in Crowdsourcing Environments
“Does placing workers together based on their personality give better performance results in cooperative crowdsourcing settings, compared to non-personality based crowd team formation?” In this work we examine the impact of personality compatibility on the effectiveness of crowdsourced team work. Using a personality-based group dynamics approach, we examine two main types of personality combina...
Experiments with crowdsourced re-annotation of a POS tagging data set
Crowdsourcing lets us collect multiple annotations for an item from several annotators. Typically, these are annotations for non-sequential classification tasks. While there has been some work on crowdsourcing named entity annotations, researchers have largely assumed that syntactic tasks such as part-of-speech (POS) tagging cannot be crowdsourced. This paper shows that workers can actually ann...
Matching Drivers and Transportation Requests in Crowdsourced Delivery Systems
While the sales volume of e-commerce transactions is growing rapidly, the traditional concept of package delivery has been challenged by innovative approaches such as crowdsourced delivery. Using individuals, for example commuters, to deliver packages from senders to receivers can provide several economic and environmental benefits. This paper illustrates an algorithm that automates and optimi...
Publication date: 2013